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Abstract 

A fast method is proposed for solving the high frequency Helmholtz 
equation. The building block of the new fast method is an overlapping 
source transfer domain decomposition method for layered medium, 
which is an extension of the source transfer domain decomposition 
method proposed by Chen and Xiang mE]. The new fast method con¬ 
tains a setup phase and a solving phase. In the setup phase, the com¬ 
putation domain is decomposed hierarchically into many subdomains 
of different levels, and the mapping from incident traces to field traces 
on all the subdomains are set up bottom-up. In the solving phase, 
first on the bottom level, the local problem on the subdomains with 
restricted source is solved, then the wave propagates on the boundaries 
of all the subdomains bottom-up, at last the local solutions on all the 
subdomains are summed up top-down. The total computation cost of 
the new fast method is 0(n2 logn) for 2D problem. Numerical experi¬ 
ments shows that with the new fast method, Helmholtz equations with 
half billion unknowns could be solved efficiently on massively parallel 
machines. 

Key words. Helmholtz equation, fast method, domain decompo¬ 
sition method, PML. 
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1 Introduction 


We consider in this paper to solve the Helmholtz equation in the full 
space with Sommerfeld radiation condition, 

Au + k^u = / in 

^ 1 / 2 /^ _ g as r = |x| —)■ oo 

or 

where k is the wave number. 

Many domain decomposition method has recently been developed 
to solve the Helmholtz equation, most of them are non-overlapped, 
and the major differences are the interface conditions. Engquist and 
Ying mm proposed a sweeping preconditioner by approximating 
the inverse of Schur complements in the LDL* factorization, Stolk 
m proposed a domain decomposition method with a transmission 
condition based on perfect matched layers, Vion an Geuzaine mi pro¬ 
posed a double sweep preconditioner that use a transmission condition 
that involves Dirichlet-to-Neumann (DtN) operator, Zepeda |15j intro¬ 
duced the method of polarized trace that use a transmission condition 
in boundary integral form, Liu and Ying |12j developed an additive 
sweeping preconditioner that use a transmission condition built with 
the boundary values of the intermediate wave directly. Chen and Xi¬ 
ang mm proposed the source transfer domain decomposition method 
that transfer the source in subdomains, and recently Du and Wu [8] 
improved the method so that the transfer applies in both directions. 

The domain decomposition method in the literature usually ap¬ 
proximately solves the Helmholtz equation with varying medium, ei¬ 
ther with approximated interface condition or with approximated Green 
function, thus they are commonly used as preconditioners for Krylov 
subspace method such as GMRES. 

An overlapping source transfer domain decomposition method is 
proposed for Helmholtz equation with layered medium, the method 
follows the natural wave traveling process in layered medium, which 
involves the reflections and refractions at the interface of the layers. 
The convergence of the new domain decomposition method is ensured 
by the overlapping region, and the accuracy of the new domain decom¬ 
position method makes it the building block of the new fast method. 

The domain decomposition method suffers from slow convergence 
rate when the number of subdomains is large, thus multilevel grid is 
needed so that the information is brought to far away subdomains 


( 1 ) 


2 


without passing the subdomains on the way. The upper level grid for 
Poisson type problem could be coarser since the amount of information 
decreases fast as the distance grows. However, for Helmholtz equation, 
the grid size should be maintained small to represent wave shapes 
on the upper level grid. Fortunately, the trace on the subdomain 
boundaries could be used to represent the solution on the subdomain, 
thus the computation cost on upper level grid is not formidable. 

The fast method we proposed first setup the trace mapping on 
subdomains of different levels. Then the sources are converted to 
traces on the bottom level, and propagate on higher and higher levels 
till the top level, then the traces on high levels are decomposed into 
traces on lower and lower level, at last the traces in the bottom level 
is converted back to solutions and summed up. In such up and down 
process, the wave travels to far away regions via the traces on high 
levels. 

The rest of the paper is organized as follows. In section 2, an 
overlapping source transfer domain decomposition method is proposed 
for Helmholtz equation with layered medium. In section 3, the fast 
algorithm is described. The multilevel domain decomposition with 
quadtree structure is built, and the algorithm to build incident trace 
to held trace mapping on subdomains is proposed, then source up and 
solution down algorithm are proposed. The numerical experiment for 
Marmousi model is present in section 4. 


2 The overlapping source transfer DDM 

The foundation of the fast method is the overlapping source transfer 
domain decomposition method for the Helmholtz equation. We hrst 
propose and analyze the overlapping STDDM for Helmholtz problem 
with three layered medium, then revise the method and substitute the 
solving of subdomain problem into mapping, and at last propose the 
overlapping decomposition method for four subdomains, which is the 
building block of the fast propagation method. 
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2.1 STDDM in three layered medium 

Consider the Helmholtz equation 0 defined in , where the source / 
is given, and the wave number k is different in three horizontal layers, 


Kv) 


fci, if y < —d 

k 2 , if —d < y < d 

/c 3 , if y>d 


( 2 ) 


as shown in Fig[2 The upper interface y = d is denoted Ti, and the 
lower interface y = —d is denoted r 2 . 
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Figure 1: Domain decomposition in y direction for three layered problem. 

The frequence domain wave equations defined on unbounded do¬ 
main could be solved on truncated domain with the perfect matched 
layer as the absorbing boundary condition PE]. To solve Helmholtz 
problem Q, the unbounded domain is truncated to a rectangle 
[—h,h] X [—h^h], with a PML layer of length attached to the 
boundary, and the trucated domain H becomes [—h — Ipmi, h + ^pmi] x 
[—I 2 — Ipmi, h + ^pmi]- We refer the domain without PML layer as the 
interior of domain H, denoted H. For simplicity , we denote —li — Ipmi 
as li- 

The uniaxial PML method [6| is used in this paper, where the 
complex coordinate is streched in x and y direction sperately, Xj {xj) = 

/ aj{t)dt, j = 1,2, and the medium perporty is chosen that crj{t) = 

Jo 

0 for |t| < Ij, and crj{t) > 0 in PML layer \t\ > Ij. Then the PML 
equation is 

J-^V • (HVm) + k'^u = f, in Q, (3) 
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where A{x) = diag ( ^ 7 —w, , and J{x) = ai{xi)a 2 {x 2 )■ 

Vai(xi) a2[x2)J 

The computation domain is decomposed to two overlapping sub- 
domains, the upper one = [—x [—1 — Zpmi ,^2 + ^pmi] and the 
lower one ^2 = [—^ 1 ,^ 1 ] x [—h — +/Dmi]) with an overlapping 

region [—TijTi] x [—1,1], as is shown in Fig lb Similar PML equations 
as ([^ are built on the two subdomains, and the parameter A and J in 
the PML equation are denoted Ai and Jj for subdomain flj, i = 1, 2. 

The new domain decomposition method first solve the subdomain 
problem with the restricted source. 


J^V-{AiVui) + k^Ui = fi, in rLi,i = 1,2 


(4) 


where /i = / • Xy<o for and /2 = / • Xy>o for ^ 2 , and the solution 
is denoted u? for i = 1 , 2 . 

Then, the wave field in hli is transfered as source to ^2 meanwhile 
the wave held in 122 is transfered as source to 12 i, with the new trans¬ 
fered sources the PML equation on the subdomains is solved and new 
wave held is generated, and so on. 


Jf • (AiV<+^) + = ^i(u|). 

in rii 

(5) 

Ti(u|) = -Jf • (AiVu|) - k'^ul. 

in rii 


■ {A2Vu^+^) + k^U^+^ = ^2K), 

in 122 

(6) 

= -J2^^ ■ {A2S7uI) - eul. 

in 122 



where Ti and T 2 are the source transfer function, s is the iteration 
step, s = 0,1,2,... Note that the transfered source 'I'i(m 2 ) = 0 for 
y<l 01 y>l + Zpmi , thus it has a compact support in the PML 
layer, so does At last, the PML solutions on subdomains are 

summed up as the solution obtained by the domain decomposition 
method, 

00 

^tddm = 

s=0 

Although the PML equation Q-Q sovles the truncated Helmholtz 
equation in the subdomain approximately, the convergence of the se¬ 
ries Q to the solution of Q could be shown by 


5 




N 


^ + ^2)J “ / 

: L('u 5 + U2) ~ f + ^ ~*~ ^2)^ 


N 


'^(^1) ~ '^(^2) + I^(^l + ^2) + ^ ^2) 

\s=2 / 

/ N N 

^(«i) - ^(^2) + L(m? + ^2) + L I + U2) 


\s=3 


— i-i{ui + U 2 ), 

and the remaining term L(tt^ + U 2 ) —>• 0 as —)■ cx), which could be 
ensured by the convergence of the PML method [3] together with the 
analysis of wave traveling in layered medium as follows. 

The solution of the domain decomposition method in the form of 
0 could be interpreted as the superposition of the incident waves, 
reflected waves and refracted waves that propagate in the layers [7], 
as is illustrated in Fig[^ 



Figure 2: Wave traveling in three layered medium. 

Suppose the incident wave Uq comes from the upper layer, then 
at interface Ti, Fq causes a reflected wave Uqq going upwards in the 
upper layer and a refracted wave Foi going downwards in the middle 
layer. The wave Uq + Uqq + Uqi is approximately the solution of 
the subdomain equation Q with i = 1 . 

Then at interface r 2 , Foi causes a reflected wave 17oi2 going down¬ 
wards in the lower layer and a refracted wave Uou going upwards in 
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the middle layer. The wave Uqi + Uqi 2 + Uqu is approximately the 
solution u\ of the subdomain equation Q. 

Then at interface Ti, [/qh causes a reflected wave C/quo going up¬ 
wards in the upper layer and a refracted wave ?7oiii going downwards 
in the middle layer. The wave ?7oii + t^oiio + b^oiii is approximately 
the solution u\ of the subdomain equation ([^. The traveling process 
goes on, and the superposition of all the waves is the solution to (§, 

U = Uo + Uqo + C^Ol + t^012 + t^Oll 

+ bioilO + + b^01112 + b^Ollll + • • • (8) 

and the series Q is approximately the series 0. 

The convergence of the new overlapping domain decomposition 
method related closely to the medium perporty of the layers and the 
size of the overlapping region. When the overlapping region of the 
subdomains lies inside the middle layer of the three, e.g., I < d, the 
convergence rate of the domain decomposition method is at most the 
convergence rate of the series Q. The worst case happens when there 
is a narrow wave guide, and the overlapping domain lies inside the 
wave guide, e.g. k 2 > ki = k^, I < d and d is small. To avoid such 
cases, the overlapping region should have a non-zero minimum size. 

The overlapping region ensures the convergence of the new do¬ 
main decomposition method for layered medium. The convergence 
of non-overlaping DDM might deteriorate if the subdomain interface 
lies right in a waveguide. We have two remarks on the new domain 
decomposition method. 

Remark 1: The convergence of the solution enables direct solv¬ 
ing the Helmholtz equation with the method, rather than use it as a 
preconditioner, which is crucial for our new fast method. 

Remark 2: An extend PML layer could be defined that it includes 
a PML layer and a layer that doesn’t absorb at all, for example, the 
layer [—/i, ^i] x [0, /-|-/pmi] is an extend PML layer. Since it’s all about 
the PML layer parameters, we do not make a distinction between the 
two and simply call them the PML layer. 

2.2 Mapping instead of solving 

The domain decomposition method in the above subsection could be 
revised that the solving of PML equation on subdomains Q-Q is 
substituted by mapping. 
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For subdomain Oi, a mapping Qi from incidents trace on the 
line [—Zi,/i] X 0 to the wave solution u in is defined as follows: 
Given on the line [— h] x 0 , solve u as its extension such that 

J2"^V • {A2VU) + k^u = 0 , in [-hji] x [ 0 ,[+ /pmi] ( 9 ) 
u = U^, on [—rijTiJxO (10) 

It’s obvious that if is the trace of a solution to Q, then u is the 
restriction of that solution on the region [—li,li] x [0,/ + ^pmi]- The 
extension u is then transfered as source, 

'I'i('u) = —in rii, (11) 


with which the wave field solution u to PML equation in subomain lli 
is solved 

• (AiVu) + ='I'i(u), in rii. (12) 

The mapping is then defined as u = Qi{U^). 

Another mapping Fi from incidents trace on the line [— li] x 0 
to the field trace on the same line, is defined by = Fi{U^) = 
("Oxo ■ ^hhough both the incident trace and the field trace 
is on the line [—x 0 , it is referred as incident boundary or field 
boundary, respectfully. For subdomain 112) similar mapping Q2 and 
T2 could be defined. 

Now the domain decomposition method for Helmholtz equation 
with three layered medium could be revised as follows: first, solve the 
subdomain problem with the restricted source, 

• (AjVttj) +=/j, in Hi, z = 1,2 ( 13 ) 


where /i = / • Xy<Q for ^i) a-ird /2 = / • Xy>o for ^^2: the solution 
is denoted for i = 1,2, and the field trace of the solutions are 

Then each subdomain takes its neighbor’s field trace as its own 
incident trace, map the incident trace to filed trace, and so on. 


= f/J’" in Hi 

= jr^([/fo+i) in Hi 
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in H2 
in H2 


( 14 ) 


for s = 0,l,2,..., and the domain decomposition solution is 


?^DDM = + ^2 + 



+ Q2 



( 15 ) 


2.3 STDDM with four subdomains 


The above domain decomposition method with two subdomain in y 
direction could be easily extended to four subdomains in both x and y 
directions. The major difference is that the incident boundaries, field 
boundaries and their source tranfer regions are a little complicated for 
four subdomains. 



Figure 3: Domain decomposition with four subdomains. The hatched area 
is the PML layer, the shaddowed area is the source transfer region, the thick 
lines are the incident or held boundaries, (a) four subdomain’s interior region 
Djj, = 1,2 and the PML layer of total domain, (b-d) incident bound¬ 
aries and corresponding source transfer region of subdomain 0 . 2 , 2 - (e-g) held 
boundaries and corresponding source transfer region of subdomain 0 . 2 ^ 2 - 

The total domain D is decomposed into four smaller subdomains 
Djj, i,j = 1 , 2 . The interior (region without PML layer) of the subdo¬ 
main Dij are denoted Djj, they are non-overlapped and their union 
is the interior of the total domain, as is shown in Fig [^-(a). Each 
subdomain Djj has its PML layer lie in its neighbors. 
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There are three kind of incident boundaries, denoted TJ and three 
kind of field boundaries, denoted Tf^-, for subdomain as in Fig 
[^-(b-g). For examples, on subdomain 0 . 2 , 2 , the incident boundary for 
wave comes from subdomain is shown in Fig[^-(b), and the field 
boundary for wave goes to subdomain is shown in Fig[^-(e). 

The incident traces on boundary F^^- are denoted as Ujj, and the 
field traces on boundary Ff^- are denoted as Ufj. The mapping from 
the incident trace to the solution on subdomain Oij is denoted Gij, 
while the the mapping from the incident trace to the field trace on 
subdomain Oij is denoted Tij 

The domain decomposition method with four subdomains is shown 
in Algorithmic In the algorithm, the wave propagates between chil¬ 
dren subdomains via the iteration ( 3 - 7 ), we call it the iteration of 
incident and field traces from now on. 


Algorithm 1 Domain decomposition with four subdomains. 


1 : Solve the mapping on subdomain Djj, i,j = 1,2, 
with direct solver. 

2 : Solve the local problem on Oij with source fij = f\Oij, 
restrict the solution u^j to field trace 
3 : while Ei,j=i,2 > e do 

4 : Send subdomain Djj’s field trace 

to its siblings Oi/ji as incident trace 
5 : Record the incident traces 

^iJ 

6 : Map the incidents trace to field trace 

7 : Set s = s -I- 1 

8: end while 

9 : Solve the local problem on Djj with the summation of incident 


traces using direct solver, the solution is denoted 




10 : Sum up the solutions of all subdomains to get the total solution 


u = 



+ Q: 




^s>0 / 


10 







Figure 4: Hierarchical Domain decomposition with 4 levels, 
right, domain decomposition on level 1, 2, 3, 4. 


From left to 


3 The Fast Propagation Method 


3.1 Hierarchical domain decomposition 

A rectangular domain of [—A, L] is decomposed into smaller rectan¬ 
gular blocks (or subdomain) on different levels. Denote the number 
of levels as Nl + 1, the level / = 0 is referred as the bottom level 
and the level I = Nl is referred as the top level. The number of 
blocks in x direction at level I is where I = Let 

, on the level I, the block which is the Fth block in 
X direction and the j-th block in y direction, is denoted where 

i,j G Ii- Each block shares an overlapping PML layer region of length 
/pmi with its neighbors on the same level. 

The quadtree structure of the multiple level domain decomposition 
is built as follows. Each block ^ij-i on level I = ..., 1 has four 

children ^ 2 i, 2 j-i\i-i-, and ^ 2 i, 2 j-,i-i on level 

I — 1. For simplicity, the children of block is denoted 
where i' = 2i—1, 2i, j' = 2j —1, 2j. On the other hand, each block Djj;; 
on level I has a father ^\i/ 2 '\,\j/ 2 '\\i+i on level / +1, where I < N^. The 
father-son relationship of the blocks leads to the quadtree structure. 

The incident boundaries and field boundaries on block ^ij-i in¬ 
clude not only the boundaries between siblings as in Fig but also 
its ascendant’s incident boundaries and field boundaries, as is shown 
in Fig We call the boundaries as in Fig the corresponding in¬ 
cident and field boundaries between siblings. The incident boundary 
of block ^ij-i is denoted Tl^-.^, and the field boundary of block ^ij-i 
is denoted TD.^. We see T^j.; C and C 

The mapping from incident trace to solution on the block is denoted 
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Gij-h and the mapping from incident trace to field trace on the block 
is denoted ■ 



Figure 5 : Incident boundaries and field boundaries extension. The hatched 
area is the PML layer, the shaddowed area is the source transfer region, 
the thick lines are the incident or field boundaries, (a) four children block’s 
interior region and the PML layer of father block (b-f) ascendant’s 

incident boundaries and corresponding source transfer region on child Gl2i,2j-,i- 
(e) total field traces and corresponding source transfer region on child fl2i,2j-,i- 


3.2 Setup phase 

In the setup phase, the mapping from incident traces to field traces is 
constructed bottom up level by level. The mapping on block ff F ,i,j G 
Iq could be computed with external direct solver, and the mapping on 
block Glij-i of level I > 0 is computed as follows. 

Given an incident 6 lies in Pj j.; , it must lie in the incident bound¬ 
ary of one of the children, denoted as First the local problem 

on children ff with source 6 is considered, and the field trace of 

the solution on F^ j^ i-i is solved by mapping Then the field 

trace of is send to its siblings as incidents, and the iteration of 

incident and filed trace between siblings applies, and the incident trace 
in the iteration is denoted Uj, , where s is the iteration number. 
At last, field trace on caused by sum of incidents computed 
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with the mapping along with the field trace caused by S on 

are add up as the field trace Ufj.i on caused by 5, 




(16) 


(17) 


Ur,r,i = 

i,r,i j/j/ s 

and the mapping is 

The algorithm of building the mapping from incident traces to field 
traces is as follows. 

Algorithm 2 Build mapping of incident traces to field traces 

On level 0, build the mapping of incident to field with direct solver, 
for levels 1 = 1 ,..., Nl do 
On block Hij-i, 
for incident 5 lies in ^\j.i do 

Find the children such that 5 lies in 

On children 


7: 


9: 

10 

11 

12 

13 

14: 


map the incidents 6 to field trace 
and add part of them to father’s field trace 
Set children Qiojoil-U 


F 0 

and set = 0 on other children 

while > e do 

Send the children’s corresponding field trace t7v’v.,_i 
to its sibilings as incidens 

Map the incidents to field trace l/J’|/Ei = 

Set s = s + 1 

end while 

Map the sum of incidents to field trace on children, 
and add them to father’s field Ufj.i. 

end for 


I,S+1 N 


15: end for 


3.3 Solve phase 

With the mapping of incident traces to filed traces that is constructed 
on each block of all levels, the Helmholtz equation could be solved in 
two phases, the source-up phase and the the solution-down phase. 
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3.3.1 The Source-up phase 

In the source-up phase, the wave propagates on all levels bottom up 
as incident traces. 

The following problem is considered, for the block the local 

solution on its four children are known, e.g., u9, so does their 

F 0 

field traces how to solve the solution Uij-i on and its 

field trace Uf'-,. The iteration of incident and filed trace between 
siblings applies directly, denote the incident traces in the iteration as 
and the solution on fiij-i is 


i',j' \ s J 


and the the field trace of u\ , is 

^iJ 


(18) 



r^F,0 


pF 





(19) 


Review the procedure we found that to apply the procedure to 
next level, the incident to field mapping operation of children 

is needed, while the solving operation could be 

post processed. Apply the procedure from bottom level to top level 
leads to the following source-up algorithm. 


Algorithm 3 Source-up 

Input: Right hand side / of the linear system 
Output: Solution u^jo on 

and sum of incidents on on level I > 0 
1: On level I = 0, 

solve the local problem on with the source 

the solution ■ q and the its field trace q are recored. 

2: for levels 1 = 1 ,..., Nl do 
3: On block Hij-i, 

use the field trace Uj of the four childrens Hi/ji-i-i, 

add part of to t/f .;, 

set 

4: while > e do 

5: Send children’s corresponding field traces 
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to its sibilings as incidens 

6: Map the incidents to field 

7: Set s = s + 1 

8: end while 

9: Sum up the incidents Yls childrens 

and map them to the field trace 
then add to Ufj.i- 

10: end for 


The solution to the total problem could then be expressed as 


u = 


X] + X] X] X] 

i,jeio i>o i,jeii i',j' 


Qi'. 


d-i 


( y~! 

s / 


( 20 ) 


3.3.2 The Solution-down phase 


In the solution-down phase the wave propagates on all levels top down 
as incident traces. 

The solution (20) resulting from Algorithm still needs to solve 
the local Helmholtz problem with given incidents on blocks of different 
levels, fortunately, the local solutions could be break down to lower 
and lower level till level 0. We consider the following problem: on the 
block given the incidents Ujj.i, how to solve Gi,j-,i{Ujj-i). 

First the incident traces is divided into the incident traces on 


children U} 


then with the incident to field mapping 


^\3 


on each children, field trace of children is generated, e.g., 
U-, ’/./I, then the iteration of incident and filed trace between siblings 

applies, and the incident traces in the iteration is denoted as . 

At last the solution on Hjj.; is 



( 21 ) 


Apply the procedure from level I = Nl to I = since there are 
already sum of incidents Yls children blocks the 

incidents Yls / z-i from Gtij-i should be added on children. The 
algoritm is discribed as follows. 
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Algorithm 4 Solution down 

Input: Solution on 

and sum of incidents on on level I > 1 
Output: Solution u of the linear system 
1: for levels I = Nl, ..., 1 do 

2: On block divide the sum of incidents to its children, 






1-1 


* J 


3: 

4: 

5: 

6 : 

7: 

8 : 

9: 


Map the incidents to field 


VI,0 


while \\Ul;j,.i_i\ \ > £ do 

Send children’s corresponding held trace 
to its siblings as incidents 

Map the incidents to held 
Set s = s + 1 

end while 

Add the sum of incidents on children 


vh'f-i ■■= E +E 


10: end for 
11: On level I = 0, 

solve the local problem on with the incidents UIj.q, 
and add the solution to total solution u. 


Now the solution to the total problem is 


u = 



ij'fi 


+ Gi 


ij',0 


Ei'lJo) 


( 22 ) 


4 Numerical experiments 

The new method is tested on the 2D Marmousi model in seismology, 
which is 3, 000 m deep and 9,200 m wide. Only P-wave is considered, 
thus elastic wave equation becomes an acoustic equation. The velocity 
prohle is shown in Fig[^ the maximum velocity is 5500 km/s and the 
minmum velocity is 1500 km/s. 
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Figure 6 : Velocity profile of Marmousi model. The solution with = 5 is 
sampled in two small boxes in the figure. 
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Figure 7 : Real part of the solution with Nl = 5 in two small boxes as marked 

in Fig|^ 


Finite difference method with second order of accuracy is used to 
discretize the Helmholtz equation. The block size on bottom level is 
400 X 400, ant the PML layer is of 40 grid points width. Single shot 
in the corner of the domain at (400/ia;, 400/iy) is taken as the source, 
where hx, hy are the grid size in x and y direction, respectfully. The 


shape of the shot is an approximate delta function, fij 


1 

hxhy 


6{i - 


4:00hx,j — 400/ly). 

The fast propagation method is suitable for parallel computing, 
and could be easily extend to thousands of cores. We test the method 
with different grid levels and grid sizes on cluster jas listed in Table 


The tolerance of residual 


\\Ax-b \\2 


is 10 Fig 


shows the solution 
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Nl 

Size 

Freq 

ujj27r 

No. 

procs 

Time 

setup 

Time 

solve 

Time 

total 

1 

2,400 X 800 

37 

12 

40 

195 

235 

2 

4,800 X 1,600 

70 

48 

140 

205 

345 

3 

9,600 X 3,200 

137 

192 

333 

309 

642 

4 

19,200 X 6,400 

270 

768 

1212 

685 

1897 

5 

38,400 X 12,800 

537 

3,072 

2891 

883 

3774 


Table 1: Time cost (in seconds) of the new method. 


Nl 

Time 
Level 0 

Time 
Level 1 

Time 
Level 2 

Time 
Level 3 

Time 
Level 4 

1 

39.6 

- 

- 

- 

- 

2 

100 

40.1 

- 

- 

- 

3 

119 

86.2 

128 

- 

- 

4 

127 

74.7 

256 

754 

- 

5 

129 

98.7 

355 

716 

1592 


Table 2: Detailed setup phase time cost (in seconds). 


with iVi = 5 in two small boxes of 400 x 400 grid points as marked in 
Fig[§ 

The time cost of solving Helmholtz equation with the fast method 
in parallel is shown in Table The setup phase is the most demanding 
part in solving, since its complexity is log N). The detailed 

time cost in setup phase is shown in Table The mapping on the 
bottom block is solved with direct solver, e.g. MUMPS [I], and the 
time cost is almost constant, since the bottom level block is of fixed 
size. However, the time cost of building mapping on level Z + 1 is 
roughly twice of level I, where I > 0, which is time consuming for 
large Helmholtz problems. 


5 Conclusions 

A fast method is proposed for solving Helmholtz equations, the new 
method has a setup phase of complexity log N) and a solve 

phase of complexity 0{N log N). Our future work is to reduce the 
computation time of the new method by exploiting the low rank struc¬ 
ture of the mappings and accelerating dense matrix operations with 
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